Batched 2nd Order + HeavySOAP by ClashLuke · Pull Request #92 · HomebrewML/HeavyBall

ClashLuke · 2026-05-09T15:35:12Z

No description provided.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c85ce08e94

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-09T15:38:26Z

+            bucket_state = states[indices[0]].setdefault(bucket_key, {})
+            for i in indices[1:]:
+                states[i][bucket_key] = bucket_state


Don't reuse bucket state when the active set changes

When same-shaped parameters share bucket_state here, the tensors inside that state are sized by the current bucket's leading dimension. Since _step_inner builds param from split_p_and_g_in_group(..., skip_none=True), any parameter with grad is None drops out of the bucket for that step; reusing the old shared state then either shifts later parameters onto the wrong slot or leaves exp_avg/GG with a leading dimension from a previous active set, causing shape errors in the SOAP/PSGD updates. This affects normal models with intermittently unused parameters, so the bucket key/state needs to be stable for the exact active members or handle resizing/masking.
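One way to read the suggestion above is to make the bucket key depend on the exact set of active members. The following is a minimal sketch under that assumption, using plain Python lists in place of tensors; `active_indices`, `get_bucket_state`, and the `"exp_avg"` list are hypothetical stand-ins, not HeavyBall's actual helpers.

```python
# Hypothetical sketch: names below are illustrative, not HeavyBall's API.

def active_indices(grads):
    """Indices of parameters that received a gradient this step."""
    return tuple(i for i, g in enumerate(grads) if g is not None)


def get_bucket_state(states, shape_key, active):
    """Return the state shared by exactly the members listed in `active`.

    Keying by (shape_key, active) means a different active set gets a
    fresh state instead of reusing buffers sized for another set.
    """
    key = (shape_key, active)
    owner = states[active[0]]
    # One slot per active member stands in for the leading dimension.
    bucket_state = owner.setdefault(key, {"exp_avg": [0.0] * len(active)})
    for i in active[1:]:
        states[i][key] = bucket_state
    return bucket_state


states = [{}, {}, {}]  # one per-parameter state dict

# Step 1: all three same-shaped parameters have gradients.
step1 = get_bucket_state(states, "2x2", active_indices([1.0, 1.0, 1.0]))

# Step 2: the middle parameter has grad None and drops out of the bucket.
step2 = get_bucket_state(states, "2x2", active_indices([1.0, None, 1.0]))
```

With this keying, the two-member bucket never sees the three-slot buffers. The trade-off is that statistics restart whenever the active set changes; the masking or resizing alternative mentioned in the review would avoid that at the cost of more bookkeeping.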

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7b4a4652d9

chatgpt-codex-connector · 2026-05-11T11:57:52Z

+            for i in indices[1:]:
+                states[i][bucket_key] = bucket_state
+
+            result = self.fn([bucket_state], group, [slab_u], [slab_g], [slab_p], *args, **kwargs)


Preserve HVP temporaries when bucketing PSGD

When PSGD/PSGDPRO are used with hessian_approx=True and same-shaped parameters are bucketed, the HVP closure stores vector/hessian_vector temporaries under each original parameter view, but this call passes a freshly stacked slab_p into the inner PSGD chain. _update_psgd_precond() later looks up get_temporary(group, param) using that slab tensor identity, so the lookup misses and it silently falls back to dampen_grad(...) instead of using the computed HVPs, fitting the wrong preconditioner for these runs.
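The identity mismatch described above can be reproduced without any tensor library. This is a simplified sketch: `Tensor`, `stack`, `set_temporary`, and the single-argument `get_temporary` are hypothetical stand-ins (the real lookup takes `group` and `param`), and the re-keying at the end is only one possible remedy, not necessarily the fix the PR adopted.

```python
# Hypothetical stand-ins; not HeavyBall's real helpers.

class Tensor:
    """Minimal placeholder that is looked up by object identity."""

    def __init__(self, values):
        self.values = list(values)


def stack(tensors):
    """Build a new object, as stacking parameter views into a slab would."""
    return Tensor([t.values for t in tensors])


temporaries = {}  # id(param) -> (vector, hessian_vector)


def set_temporary(param, value):
    temporaries[id(param)] = value


def get_temporary(param):
    return temporaries.get(id(param))  # None on a miss


# The HVP closure stores temporaries under each original parameter view.
p0, p1 = Tensor([1.0, 2.0]), Tensor([3.0, 4.0])
set_temporary(p0, ("v0", "hv0"))
set_temporary(p1, ("v1", "hv1"))

# Bucketing passes a freshly stacked slab to the inner update.
slab_p = stack([p0, p1])
missed = get_temporary(slab_p)  # the slab is a different object

# One possible remedy: gather the per-view temporaries before calling the
# inner function and register them under the slab identity.
per_view = [get_temporary(p) for p in (p0, p1)]
if all(t is not None for t in per_view):
    vectors, hvps = zip(*per_view)
    set_temporary(slab_p, (list(vectors), list(hvps)))
found = get_temporary(slab_p)
```

Here `missed` is `None`, which corresponds to the silent fallback the review warns about, while `found` holds the stacked temporaries once they are registered under the slab.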

ClashLuke added 7 commits May 7, 2026 15:55

bucket2

05c57e5

clean up skipupdate

268cd7a

also rotate exp avg sq

99c2fef

benchmark and clean up

b27753c

readd soap

46eb010

ruff

fbfa3db

bump

c85ce08

chatgpt-codex-connector Bot reviewed May 9, 2026

ClashLuke added 3 commits May 10, 2026 14:31

handle squeeze

6ed315d

clean up

75d79fa

fix adopt sequence of operations

7b4a465

chatgpt-codex-connector Bot reviewed May 11, 2026

ClashLuke added 3 commits May 12, 2026 09:12

simplify buckets, fix sam

7495f1e

simplify buckets

387d6d4

ruff

acf4917

ClashLuke merged commit 24fce43 into main May 13, 2026
6 checks passed

ClashLuke merged 13 commits into main from bucket2
